27 research outputs found

    Machine learning applied to enzyme turnover numbers reveals protein structural correlates and improves metabolic models.

    Knowing the catalytic turnover numbers of enzymes is essential for understanding the growth rate, proteome composition, and physiology of organisms, but experimental data on enzyme turnover numbers is sparse and noisy. Here, we demonstrate that machine learning can successfully predict catalytic turnover numbers in Escherichia coli based on integrated data on enzyme biochemistry, protein structure, and network context. We identify a diverse set of features that are consistently predictive for both in vivo and in vitro enzyme turnover rates, revealing novel protein structural correlates of catalytic turnover. We use our predictions to parameterize two mechanistic genome-scale modelling frameworks for proteome-limited metabolism, leading to significantly higher accuracy in the prediction of quantitative proteome data than previous approaches. The presented machine learning models thus provide a valuable tool for understanding metabolism and the proteome at the genome scale, and elucidate structural, biochemical, and network properties that underlie enzyme kinetics

    Determinants of penetrance and variable expressivity in monogenic metabolic conditions across 77,184 exomes

    The penetrance of variants in monogenic disease and the clinical utility of common polygenic variation have not been well explored on a large scale. Here, the authors use exome sequencing data from 77,184 individuals to generate penetrance estimates and assess the utility of polygenic variation in risk prediction of monogenic variants.

    Identification of 12 new susceptibility loci for different histotypes of epithelial ovarian cancer.

    To identify common alleles associated with different histotypes of epithelial ovarian cancer (EOC), we pooled data from multiple genome-wide genotyping projects totaling 25,509 EOC cases and 40,941 controls. We identified nine new susceptibility loci for different EOC histotypes: six for serous EOC histotypes (3q28, 4q32.3, 8q21.11, 10q24.33, 18q11.2 and 22q12.1), two for mucinous EOC (3q22.3 and 9q31.1) and one for endometrioid EOC (5q12.3). We then performed meta-analysis on the results for high-grade serous ovarian cancer with the results from analysis of 31,448 BRCA1 and BRCA2 mutation carriers, including 3,887 mutation carriers with EOC. This identified three additional susceptibility loci at 2q13, 8q24.1 and 12q24.31. Integrated analyses of genes and regulatory biofeatures at each locus predicted candidate susceptibility genes, including OBFC1, a new candidate susceptibility gene for low-grade and borderline serous EOC

    Genetic variation at CYP3A is associated with age at menarche and breast cancer risk : a case-control study

    Introduction: We have previously shown that a tag single nucleotide polymorphism (rs10235235), which maps to the CYP3A locus (7q22.1), was associated with a reduction in premenopausal urinary estrone glucuronide levels and a modest reduction in risk of breast cancer in women aged ≤50 years. Methods: We further investigated the association of rs10235235 with breast cancer risk in a large case-control study of 47,346 cases and 47,570 controls from 52 studies participating in the Breast Cancer Association Consortium. Genotyping of rs10235235 was conducted using a custom Illumina Infinium array. Stratified analyses were conducted to determine whether this association was modified by age at diagnosis, ethnicity, age at menarche or tumor characteristics. Results: We confirmed the association of rs10235235 with breast cancer risk for women of European ancestry but found no evidence that this association differed with age at diagnosis. Heterozygote and homozygote odds ratios (ORs) were OR = 0.98 (95% CI 0.94, 1.01; P = 0.2) and OR = 0.80 (95% CI 0.69, 0.93; P = 0.004), respectively (P trend = 0.02). There was no evidence of effect modification by tumor characteristics. rs10235235 was, however, associated with age at menarche in controls (P trend = 0.005) but not cases (P trend = 0.97). Consequently, the association between rs10235235 and breast cancer risk differed according to age at menarche (P het = 0.02); the rare allele of rs10235235 was associated with a reduction in breast cancer risk for women who had their menarche at age ≥15 years (ORhet = 0.84, 95% CI 0.75, 0.94; ORhom = 0.81, 95% CI 0.51, 1.30; P trend = 0.002) but not for those who had their menarche at age ≤11 years (ORhet = 1.06, 95% CI 0.95, 1.19; ORhom = 1.07, 95% CI 0.67, 1.72; P trend = 0.29). Conclusions: To our knowledge, rs10235235 is the first single nucleotide polymorphism to be associated with both breast cancer risk and age at menarche, consistent with the well-documented association between later age at menarche and a reduction in breast cancer risk. These associations are likely mediated via an effect on circulating hormone levels.

    Network-level allosteric effects are elucidated by detailing how ligand-binding events modulate utilization of catalytic potentials

    Allosteric regulation has traditionally been described by mathematically-complex allosteric rate laws in the form of ratios of polynomials derived from the application of simplifying kinetic assumptions. Alternatively, an approach that explicitly describes all known ligand-binding events requires no simplifying assumptions while allowing for the computation of enzymatic states. Here, we employ such a modeling approach to examine the “catalytic potential” of an enzyme (an enzyme’s capacity to catalyze a biochemical reaction). The catalytic potential is the fundamental result of multiple ligand-binding events that represents a “tug of war” among the various regulators and substrates within the network. This formalism allows for the assessment of interacting allosteric enzymes and development of a network-level understanding of regulation. We first define the catalytic potential and use it to characterize the response of three key kinases (hexokinase, phosphofructokinase, and pyruvate kinase) in human red blood cell glycolysis to perturbations in ATP utilization. Next, we examine the sensitivity of the catalytic potential by using existing personalized models, finding that the catalytic potential allows for the identification of subtle but important differences in how individuals respond to such perturbations. Finally, we explore how the catalytic potential can help to elucidate how enzymes work in tandem to maintain a homeostatic state. Taken together, this work provides an interpretation and visualization of the dynamic interactions and network-level effects of interacting allosteric enzymes.

    MASSpy: Building, simulating, and visualizing dynamic biological models in Python using mass action kinetics.

    Mathematical models of metabolic networks utilize simulation to study system-level mechanisms and functions. Various approaches have been used to model the steady state behavior of metabolic networks using genome-scale reconstructions, but formulating dynamic models from such reconstructions continues to be a key challenge. Here, we present the Mass Action Stoichiometric Simulation Python (MASSpy) package, an open-source computational framework for dynamic modeling of metabolism. MASSpy utilizes mass action kinetics and detailed chemical mechanisms to build dynamic models of complex biological processes. MASSpy adds dynamic modeling tools to the COnstraint-Based Reconstruction and Analysis Python (COBRApy) package to provide a unified framework for constraint-based and kinetic modeling of metabolic networks. MASSpy supports high-performance dynamic simulation through its implementation of libRoadRunner: the Systems Biology Markup Language (SBML) simulation engine. Three examples are provided to demonstrate how to use MASSpy: (1) a validation of the MASSpy modeling tool through dynamic simulation of detailed mechanisms of enzyme regulation; (2) a feature demonstration using a workflow for generating an ensemble of kinetic models using Monte Carlo sampling to approximate missing numerical values of parameters and to quantify biological uncertainty; and (3) a case study in which MASSpy is utilized to overcome issues that arise when integrating experimental data with the computation of functional states of detailed biological mechanisms. MASSpy represents a powerful tool to address challenges that arise in dynamic modeling of metabolic networks, both at small and large scales.